1. INTRODUCTION

Tomato (Solanum lycopersicum) is one of the most widely consumed vegetable crops
in Malaysia [1]. Tomato is regarded as a good nutritional source of vitamin C. However, tomato plants are
prone to infection by various diseases. Some of the disease pathogens are fungal organisms, while others
are bacterial or viral [2]. To curb the spread of these diseases, various types of pesticides are used to kill
the pathogens. The widespread use of these chemicals poses harm to human health as well as to the
environment [3]. Several common tomato plant diseases are late blight, leaf mold, and mosaic virus.
Late blight [4] is a potentially serious tomato disease caused by infection with the Phytophthora fungus.
It causes lesions, which appear as small, dark, water-soaked spots. These leaf spots quickly enlarge
and a white mold appears at the margins of the affected areas. Another fungus, Passalora fulva, is the
cause of leaf mold, which usually occurs on the older leaves that are close to the soil, where air circulation
is poor and humidity is high. The initial symptom of this disease is pale green or yellowish spots on the upper
leaf surface, which enlarge and turn into more distinctive yellow spots. The tomato mosaic virus [5]
belongs to the Tobamovirus genus, a group of plant-pathogenic viruses. The symptoms can be
observed at any stage of plant growth and affect all parts of the plant.

Recognition and classification of plant diseases play a vital role in agriculture.
The quality, quantity, and productivity of the plants depend on the timely detection of diseases.
Therefore, an automatic system needs to be developed so that the detection process can run
with minimal human intervention [6]. Such a system will help farmers diagnose a disease at an
early stage and allow them to take mitigation actions before it is too late. One way to detect the
diseases is a deep learning approach based on supervised learning. The shadow occlusion
problem is not considered here, since shadows can be removed beforehand using a shadow removal method [7].
In this paper, the parameters of the MobileNet V2 network are trained to classify tomato leaf diseases into
their respective classes, and a validation test is then performed to measure the effectiveness of the
proposed system. A subset of the PlantVillage dataset covering three types of tomato diseases and healthy
leaves is downloaded from the Kaggle platform; it includes 373 tomato mosaic virus images, 1756 late blight
images, 952 leaf mold images, and 1590 healthy tomato images [8]. The dataset is split into training and
testing subsets. To meet the input requirements of the MobileNet V2 model, the input images are rescaled to
224x224 pixels.


2. LITERATURE REVIEW 
2.1. Deep learning 

Deep learning is a state-of-the-art machine learning method that utilizes a complex network of
artificial nodes with a large number of hidden layers. Many of the techniques that preceded deep
learning perform classification through handcrafted semantic features, such as
corners, edges, and shapes. A deep learning approach does not require the features to be designed ahead of
time; they are instead learned automatically during training. Therefore, this method is robust to various
modes in the data, as the features are not handcrafted [9]. Applications of the deep
learning approach include object tracking [10-11], disease screening [12-13], physiotherapy [14], face retrieval
[15], and remote sensing [16].


2.2. Development of convolutional neural network (CNN)
2.2.1. GoogLeNet

GoogLeNet, proposed in [17], is the winner of ILSVRC 2014. The network uses
parallel branches of convolutional layers with various kernel sizes. It contains 22 layers
with more than 7,000,000 parameters. For reference, AlexNet has around 60,000,000 parameters,
so GoogLeNet has roughly eight times fewer trainable parameters. As a consequence, the processing
complexity of GoogLeNet is also much lower than that of AlexNet [18]. In general, GoogLeNet has also been
shown to be consistently more accurate than AlexNet.


2.2.2. Residual network (ResNet)

ResNet [19] was first introduced in 2015, when it won the ILSVRC competition with an error
rate of 3.57%. ResNet's high accuracy can be mainly attributed to the introduction of residual layers that
allow the network to be designed deeper than the previous popular network architectures [20].
The residual layer, also known as an identity mapping, feeds the output of an earlier layer directly to later
layers, which mitigates the vanishing gradient problem in training a deep network. The idea is to prevent the
input features from degrading toward zero as they pass through successive layers [21].
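
The identity mapping can be illustrated with a minimal sketch. It assumes a single fully connected layer with a ReLU activation in place of the convolutional layers of an actual ResNet block, and the function names are hypothetical. When the learned branch outputs zeros, the block still passes its input through unchanged.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(values, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [
        sum(w * v for w, v in zip(row, values)) + b
        for row, b in zip(weights, bias)
    ]


def residual_block(x, weights, bias):
    """Simplified residual block: output = F(x) + x.

    F is a single dense layer followed by ReLU (an assumption made
    for brevity; ResNet uses stacked convolutional layers).
    """
    fx = relu(dense(x, weights, bias))
    return [f + xi for f, xi in zip(fx, x)]


# With all-zero weights the learned branch contributes nothing,
# so the input is forwarded as-is through the skip connection.
zero_weights = [[0.0, 0.0], [0.0, 0.0]]
zero_bias = [0.0, 0.0]
passthrough = residual_block([1.0, -2.0], zero_weights, zero_bias)
```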


2.2.3. MobileNet V2

MobileNet is a deep learning architecture designed for mobile platforms, where
computational resources are limited. An improved version, called MobileNet V2 [22], was later
introduced by Google with slight modifications to the original version. The basis of the network remains
the same, which is separable convolution. MobileNet V2 pre-trained on the ImageNet dataset has
been used to extract fruit image features in [23]. The paper claimed that the number of parameters is reduced
from 4.24 million to just 3.47 million, while the accuracy is improved.


3. RESEARCH METHOD 
3.1. Disease detection using MobileNet 

MobileNet V2 is an improvement over MobileNet V1 [24]. Both retain separable
convolution as the core layer, in which the number of trained parameters is much smaller than in full
convolution. The small number of parameters makes MobileNet V2 suitable for mobile
phone applications, where hardware resources are far more limited than on a desktop. Separable convolution
is divided into two distinct steps: depthwise convolution and pointwise convolution [25].


3.1.1. Depthwise convolution

Depthwise convolution is a reduced version of convolution in which each channel is
processed separately. A standard convolution over a height x width x channel input feature map is divided into
several groups according to the number of channels, which signifies the depth. The kernel is
split following the same grouping, so each channel has its own single-channel kernel. The depthwise
convolution collects spatial features from each channel separately, and thus
the number of parameters needed is also reduced significantly.
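
The per-channel operation can be sketched in plain Python as follows. This is an illustrative sketch only, assuming stride 1 and no padding; the helper names are hypothetical, and the experiments themselves rely on the TensorFlow implementation.

```python
def conv2d_single(channel, kernel):
    """Stride-1, no-padding 2D convolution (cross-correlation) of one
    channel with one single-channel kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [
        [
            sum(
                channel[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def depthwise_conv(feature_map, kernels):
    """Apply one kernel per channel; channels are never mixed."""
    if len(feature_map) != len(kernels):
        raise ValueError("one kernel is required per input channel")
    return [conv2d_single(ch, k) for ch, k in zip(feature_map, kernels)]


# Two 3x3 channels, each filtered by its own 2x2 kernel.
feature_map = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
kernels = [
    [[1, 0], [0, 1]],  # sums the main diagonal of each 2x2 window
    [[1, 1], [1, 1]],  # sums every value in each 2x2 window
]
output = depthwise_conv(feature_map, kernels)
```

The output has the same number of channels as the input, which is why a second step is needed to combine information across channels.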


3.1.2. Pointwise convolution

Pointwise convolution is the counterpart of depthwise convolution. The width and height of the
kernel are set to 1, while its depth equals the number of input channels. It is cascaded right after the
depthwise convolution to approximate the full convolution with far fewer parameters. It can also
be used to set the number of output feature channels.
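
The parameter saving can be checked with a short calculation. The sketch below ignores bias terms, and the layer size (3x3 kernel, 32 input channels, 64 output channels) is an illustrative assumption rather than a layer taken from MobileNet V2.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a full convolution (bias terms ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise step followed by a 1x1 pointwise step."""
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Illustrative layer size, chosen only for this example.
full = standard_conv_params(3, 32, 64)
separable = separable_conv_params(3, 32, 64)
reduction = full / separable
```

For this layer the full convolution needs 18432 weights while the separable version needs 2336, a reduction of roughly eight times.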


3.2. The method based on MobileNet V2

Figure 1 shows all components of the system, in which the PlantVillage dataset is used to train and test
the proposed MobileNet V2-based tomato disease screening algorithm. A total of 4671 images are extracted
from the full dataset: 1590 images of healthy tomato leaves, 952 leaf mold images, 1756 late blight
images, and 373 mosaic virus images. All leaves are captured individually without interference from other
leaves, as shown in Table 1. In these experiments, the images are randomly split into two subsets: 3471
training images and 1200 testing images. To meet the input requirement of the MobileNet V2 model, all
images are rescaled to 224 x 224 pixels.
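
A random split of this kind can be sketched as below. The class counts and subset sizes are those stated above; the seed value, the class key names, and the function name are assumptions made for illustration, since the exact splitting procedure is not specified.

```python
import random

# Image counts per class, as stated in the text.
CLASS_COUNTS = {
    "healthy": 1590,
    "leaf_mold": 952,
    "late_blight": 1756,
    "mosaic_virus": 373,
}


def random_split(num_items, num_test, seed=0):
    """Shuffle item indices and return (train_indices, test_indices).

    The seed is a hypothetical choice that makes the split repeatable.
    """
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)
    return indices[num_test:], indices[:num_test]


total_images = sum(CLASS_COUNTS.values())
train_idx, test_idx = random_split(total_images, num_test=1200)
```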


Figure 1. Main components of the tomato disease detection using MobileNet V2
[Blocks shown: Dataset; Splitting the data; Training/testing the data; MobileNet V2;
Classification of tomato leaf; Performance metrics]


Table 1. Some samples of tomato leaf diseases

Class           Sample image of tomato leaves
Healthy Leaf    [image]
Leaf Mold       [image]
Late Blight     [image]
Mosaic Virus    [image]


After that, MobileNet V2 is trained either from scratch through random initialization or using transfer
learning techniques. Subsequently, a training progress chart is plotted to demonstrate the performance of
MobileNet V2 in classifying the tomato diseases. Accuracy is then used as the performance metric to
measure classification performance over the four classes: healthy tomato leaves, late blight, leaf
mold, and mosaic virus.
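
The accuracy metric is the fraction of test images whose predicted class matches the true class. A minimal sketch is given below; the label strings and the short example lists are hypothetical and are not results from the experiments.

```python
CLASSES = ("healthy", "late_blight", "leaf_mold", "mosaic_virus")


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted class equals the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels for five test images (illustration only).
y_true = ["healthy", "late_blight", "leaf_mold", "mosaic_virus", "healthy"]
y_pred = ["healthy", "late_blight", "leaf_mold", "late_blight", "healthy"]
example_accuracy = accuracy(y_true, y_pred)  # 4 of 5 correct
```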


4. RESULTS AND DISCUSSION 

In this paper, the experiments are run on an HP machine with an Intel Core i7-3770 @ 3.9 GHz CPU and 8 GB of
memory. No graphics processing unit is used; the standard CPU-based TensorFlow is run from
Python. Several hyper-parameter configurations of MobileNet V2 are tested, covering
batch size, optimizer selection, and learning rate. The testing is done sequentially: each
hyper-parameter is tested separately to find its best setting.
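
One way to read this sequential procedure is as a one-at-a-time search: each hyper-parameter is varied while the others are held fixed, and the best value is kept before moving on. The sketch below follows that reading. The function names, the default configuration, and the toy scoring function are all assumptions; in practice `evaluate` would train MobileNet V2 and return its test accuracy.

```python
import math


def sequential_search(search_space, defaults, evaluate):
    """Tune one hyper-parameter at a time, keeping the best value found."""
    best = dict(defaults)
    for name, candidates in search_space.items():
        scores = {}
        for value in candidates:
            trial = dict(best)
            trial[name] = value
            scores[value] = evaluate(trial)
        best[name] = max(scores, key=scores.get)
    return best


def toy_evaluate(config):
    """Stand-in score with arbitrary values; NOT a trained model."""
    optimizer_bonus = {"adagrad": 0.3, "adam": 0.2, "sgd": 0.1}
    score = optimizer_bonus[config["optimizer"]]
    score -= 0.1 * abs(math.log10(config["learning_rate"]) + 3)
    score -= config["batch_size"] / 1000
    return score


search_space = {
    "optimizer": ["adagrad", "adam", "sgd"],
    "learning_rate": [0.01, 0.001, 0.0001],
    "batch_size": [16, 32, 48],
}
defaults = {"optimizer": "adam", "learning_rate": 0.001, "batch_size": 32}
best_config = sequential_search(search_space, defaults, toy_evaluate)
```

This approach needs far fewer training runs than a full grid search, at the cost of ignoring interactions between hyper-parameters.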


4.1. Optimization method 

Table 2 and Figure 2 show the classification results of five different optimizers,
namely Adagrad, Adam, SGD, RMSprop, and Nadam. Among these five optimizers, Adagrad gives
the best accuracy of 0.9434, followed by Adam and SGD with accuracies of 0.8996 and 0.8558,
respectively. RMSprop and Nadam return low accuracies of 0.6282 and 0.0908, respectively, in classifying
the tomato plant diseases.


Table 2. Training and testing subset final accuracy and final loss of various optimizers

Optimizer    Final Accuracy    Final Loss
Adagrad      0.9434            0.1554
Adam         0.8996            0.2587
SGD          0.8558            0.4306
RMSprop      0.6282            4.3899
Nadam        0.0908            20.5444


Figure 2. Testing accuracy and final loss of various optimizer methods


4.2. Learning rate 

The learning rate is a hyper-parameter that controls how much of the error gradient is used to update
the current weights. In this test, the Adam optimizer is selected as the basis for the testing because of the
noticeable difference in its results once the rate is varied. Three rates are tested, which are 0.01, 0.001
and 0.0001, as shown in Table 3. For the same number of training epochs, the best-performing learning rate
is 0.001. The results in Figure 3 show that an accuracy of
0.8996 is achieved when a learning rate of 0.001 is used.
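
The role of the learning rate can be illustrated with plain gradient descent on a toy loss. This sketch deliberately uses the basic update rule (weight minus learning rate times gradient) rather than Adam, and the quadratic loss and the rate values are assumptions chosen only to show the effect.

```python
def gradient_descent_step(weight, gradient, learning_rate):
    """Basic update rule: move against the gradient, scaled by the rate."""
    return weight - learning_rate * gradient


def minimize_quadratic(start, learning_rate, steps):
    """Minimize loss(w) = w**2, whose gradient is 2 * w."""
    weight = start
    for _ in range(steps):
        weight = gradient_descent_step(weight, 2.0 * weight, learning_rate)
    return weight


# Same start and step count, three illustrative rates.
slow = minimize_quadratic(1.0, learning_rate=0.01, steps=10)
good = minimize_quadratic(1.0, learning_rate=0.1, steps=10)
unstable = minimize_quadratic(1.0, learning_rate=1.1, steps=10)
```

A rate that is too small barely moves the weight toward the minimum at zero, while a rate that is too large overshoots and diverges, which is consistent with accuracy peaking at an intermediate rate.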


Table 3. Final accuracy and final loss of various learning rates

Learning Rate    Final Accuracy    Final Loss
0.00001          0.7585            0.5957
0.0001           0.8996            0.2587
0.001            0.7606            3.3920


Figure 3. Final accuracy and final loss for various learning rates


4.3. Training and Testing Subset 

Training data is the subset used to fit the parameters of MobileNet V2 (the weights and biases, in the case
of a standard CNN), while testing data is the subset used to evaluate the performance of the trained
network. Inspired by [26], four ways of dividing the data between training and
testing, as shown in Table 4, are explored: 9:1, 4:1, 7:3, and 3:2 ratios. The best tomato disease
classification result is obtained when a split ratio of 4:1 is used between training and testing data, with
an accuracy of 0.9562, as shown in Figure 4.
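
The subset sizes for each ratio follow from the 4671 images in the dataset. The sketch below assumes that the training count is rounded down and the remainder is assigned to testing; under that assumption it reproduces the sizes listed in Table 4.

```python
TOTAL_IMAGES = 4671


def split_sizes(total, train_parts, test_parts):
    """Return (train, test) sizes for a train_parts:test_parts ratio.

    The training size is rounded down (an assumption); the remainder
    goes to the testing subset.
    """
    train = total * train_parts // (train_parts + test_parts)
    return train, total - train


RATIOS = [(9, 1), (4, 1), (7, 3), (3, 2)]
sizes = [split_sizes(TOTAL_IMAGES, a, b) for a, b in RATIOS]
```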


Table 4. Final accuracy and final loss between training and testing subset 


Training Testing Final Accuracy Final Loss 
4203 468 0.9083 0.3380 
3736 935 0.9562 0.1402 
3269 1402 0.7612 0.8120 
2802 1869 0.3866 0.8583 

Figure 4. Final accuracy and final loss between training and testing subset


4.4. Batch Size 

Batch size is a hyper-parameter that controls the number of images fed to the network in one
training iteration. It allows the weight update to be based on several images instead of an individual
image. Less fluctuation in training error is observed when mini-batches are used, but a too-large batch
size will result in over-generalization. Three batch sizes are explored: 16, 32, and 48. Table 5 shows the
classification accuracy as the batch size is increased from 16 to 48. The best performance is obtained with
a batch size of 16, which gives an accuracy of 0.9594. Figure 5 also reveals that the accuracy decreases as
the batch size is increased from 16 to 48.
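
The effect of batch size on the number of weight updates per epoch can be sketched as follows. The training set size of 3736 images is an assumption (it corresponds to the 4:1 split); the text does not state which split is used in this test, and the helper names are hypothetical.

```python
import math


def iterate_batches(items, batch_size):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def iterations_per_epoch(num_images, batch_size):
    """Number of weight updates needed to see every image once."""
    return math.ceil(num_images / batch_size)


# Assumed training set size (4:1 split); not confirmed by the text.
NUM_TRAINING_IMAGES = 3736
updates = {
    batch_size: iterations_per_epoch(NUM_TRAINING_IMAGES, batch_size)
    for batch_size in (16, 32, 48)
}
```

Under this assumption a batch size of 16 gives 234 updates per epoch, compared with 117 for 32 and 78 for 48, so the smallest batch size performs the most weight updates for the same number of epochs.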


Table 5. Final accuracy and final loss of various batch sizes

Batch Size    Final Accuracy    Final Loss
48            0.7970            0.6940
32            0.8632            0.3912
16            0.9594            0.1265


Figure 5. Final accuracy and final loss for various batch sizes


5. CONCLUSION

In conclusion, MobileNet V2 has been successfully implemented to classify various tomato plant diseases
based on captured leaf images. The best classification performance is obtained when MobileNet V2 is trained
using Adagrad with a batch size of 16. The experimental results also show that a learning rate of 0.001 and
a 4:1 data division between training and testing deliver the most accurate classification performance.
For future work, all classes in the PlantVillage dataset will be explored instead of just three diseases.